Efficient evaluation of reachability query for directed acyclic XML graph based on a prime number labelling schema
Authors
Abstract
Many schema labelling approaches have been designed to facilitate querying of XML documents. The proposed algorithms rely on the fact that ancestor–descendant relationships among nodes can be determined quickly. Schema labelling is a family of techniques widely used for indexing trees, graphs, and structured XML graphs, in which a unique identifier is assigned to each node in the tree or graph. The generated identifier is then used in the index as a reference to the actual node, so that structural relationships among nodes can be captured quickly. In this paper, we extend the prime number schema labelling algorithm to label DAG XML graphs. Our main contribution is to scale down the original XML graph substantially, based on Strongly Connected Component (SCC) principles. Each node in the DAG is labelled with an integer that is the product of the prime number associated with the node and its parent's label. The schema does not depend on a spanning tree. Thus, subsumption hierarchies represented in a DAG can be explored efficiently by checking divisibility among the labels. The schema also inherits dynamic update ability and compact label size from its predecessors. Our theoretical analysis and experimental results show that the generated labelling schema is efficient and scalable for processing reachability queries on large XML graphs. © 2012 Faculty of Computers and Information, Cairo University. Production and hosting by Elsevier B.V. All rights reserved.
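The divisibility test described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: it assumes the graph has already been reduced to a DAG (for example, after collapsing each SCC into one node), and it assumes that a node with several parents multiplies its own prime by the least common multiple of its parents' labels, which the abstract does not specify. The names `primes`, `label_dag`, and `reaches` are hypothetical, and the code requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter
from itertools import count
from math import lcm


def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for a small sketch)."""
    found = []
    for n in count(2):
        if all(n % p for p in found):
            found.append(n)
            yield n


def label_dag(parents):
    """Label every node of a DAG.

    `parents` maps a node to the set of its direct parents.
    Assumption (not from the paper): with several parents, the label is
    own prime * lcm(parent labels), so a label is the product of the
    primes of the node and of all its ancestors.
    """
    gen = primes()
    own, label = {}, {}
    # static_order() yields every parent before any of its children.
    for node in TopologicalSorter(parents).static_order():
        own[node] = next(gen)
        parent_labels = (label[p] for p in parents.get(node, ()))
        label[node] = own[node] * lcm(*parent_labels)  # lcm() of nothing is 1
    return own, label


def reaches(label, u, v):
    """True if v is reachable from u (or u == v): label[u] divides label[v]."""
    return label[v] % label[u] == 0
```

For a diamond-shaped DAG with edges a→b, a→c, b→d and c→d, `reaches(label, "a", "d")` holds while `reaches(label, "b", "c")` does not. Labels built this way grow with the number of ancestors, so the sketch says nothing about the compact label sizes the abstract claims.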
Similar resources
Engineering an Efficient Reachability Algorithm for Directed Graphs
A reachability query on a directed graph G asks whether there exists a path from a node s to a node t. Answering such queries on large graph-like datasets has become an issue in various fields of research and real-world applications over the past 2...
Subgraph Join: Efficient Processing Subgraph Queries on Graph-Structured XML Document
The information in many applications can be naturally represented as a graph-structured XML document. A structural query on a graph-structured XML document matches subgraphs of the document against a given schema. Query processing on graph-structured XML documents brings new challenges. In this paper, for the processing of subgraph queries, we design a subgraph join algorithm based ...
Indexing collections of XML documents with arbitrary links
In recent years, the popularity of XML has increased significantly. XML is the extensible markup language of the World Wide Web Consortium (W3C). XML is used to represent data in many areas, such as traditional database management systems, e-business environments, and the World Wide Web. XML data, unlike relational and objec...
Efficient Graph Reachability Query Answering Using Tree Decomposition
Efficient reachability query answering in large directed graphs has been intensively investigated because of its fundamental importance in many application fields such as XML data processing, ontology reasoning and bioinformatics. In this paper, we present a novel indexing method based on the concept of tree decomposition. We show analytically that this intuitive approach is both time and space...
Labeling RDF Graphs for Linear Time and Space Querying
Indices and data structures for web querying have mostly considered tree-shaped data, reflecting the view of XML documents as tree-shaped. However, for RDF (and when querying ID/IDREF constraints in XML), data is indisputably graph-shaped. In this chapter, we first study existing indexing and labeling schemes for RDF and graph data in general, with a focus on support for efficient adjacency and rea...